Finding Your Way through Blogspace: Using Semantics for Cross-Domain Blog Analysis

Authors

  • Bettina Berendt
  • Roberto Navigli
Abstract

Blogspace is one of the most dynamic areas of today’s Internet, and it is increasingly recognised that blogs are much more than “meaningless chatter”. Many syntax-based approaches exist for analysing the text of blogs and the network structure between them. While these approaches are very helpful for purposes such as detecting discussion bursts concerning uniquely named topics (e.g., a book, product, or person), they are insufficient for understanding blogs that discuss new phenomena in different wordings, or for finding and explaining relationships between new discourse topics or the context of a new topic in a larger domain of discourse. In this paper, we propose two methods for semantics-enhanced blog analysis that allow the analyst to integrate domain-specific as well as general background knowledge. The methods rely on the Term Extractor for identifying keyphrases (Navigli & Velardi, 2004), SSI (Structural Semantic Interconnections) for disambiguating terms (Navigli & Velardi, 2005), and the taxonomy of domain labels of Magnini & Cavaglià (2000). Applications include topic detection and grouping, the proposal of blog tags and the formation of blog directories, and blog recommender systems. To illustrate the usefulness of our approach, we present a detailed experimental analysis of a sample of four sets of blogs with different thematic foci (food, health, law, and weblogs about blogging).

Similar resources

Bloggers during the London attacks: Top information sources and topics

Blogs are probably most associated with the high profile postings of a few highly popular bloggers who debate or comment on major news stories, but for each ‘A-lister’ there are numerous faceless bloggers who write about their own daily lives and/or interests. Hence it is interesting to investigate the extent to which an event with extensive media coverage, such as the London attacks, is reflec...

The Domain of the semantics of ‘promise’ in the Holy Quran

Semantics is a branch of linguistics through which the meaning of the words and sentences of a text can be analyzed and parts of speech identified with regard to meaning. This descriptive-analytic study examines the meaning of ‘promise’ in the Holy Quran, based on principles of semantics with a collocation approach and using library methodology. Also, by virtue of ...

Traffic Characteristics and Communication Patterns in Blogosphere

We present a thorough characterization of the access patterns in blogspace – a fast-growing constituent of the content available through the Internet – which comprises a rich interconnected web of blog postings and comments by an increasingly prominent user community that collectively define what has become known as the blogosphere. Our characterization of over 35 million read, write, and admin...

BlogGrid: Towards an Efficient Information Pushing Service on Blogspace

With increasing concerns about the personalized information space, users have been posting various types of information on their own blogs. Due to the domain-specific properties of blogging systems, however, searching for relevant information is too difficult. In this paper, we focus on analyzing user behaviors on blogspace, so that the channel between two similar users can be virtually generat...

MoodViews: Tools for Blog Mood Analysis

We demonstrate a system for tracking and analyzing moods of bloggers worldwide, as reflected in the largest blogging community, LiveJournal. Our system collects thousands of blog posts every hour, performs various analyses on the posts and presents the results graphically. Exploring the Blogspace From the point of view of information access, the blogspace offers many natural opportunities beyon...

Publication date: 2006